San Francisco, ‘The City,’ like all cities has an interest in using data to understand social patterns such as crime. Crimes of particular interest to visitors and residents alike are those against persons or their property. This report seeks to visualize when and where crimes against persons, or the theft of their property, are more likely to happen in the City. It uses data from the summer of 2014 giving the time and location of crime reports from across all of San Francisco’s neighborhoods. The goal is to visually expose how incidents vary by time of day, day of the week, or month, and what this suggests broadly about differences in these general patterns in San Francisco. The findings are summarized at the end of the report.
The data set contains 28,993 records from June 1, 2014 to August 31, 2014. The data was sourced from SFPD Incidents from the old system; the San Francisco police have since implemented a new system for tracking crime. Also note that the new Police Station boundaries are not reflected in the dataset, so it cannot be compared with data from July 19, 2015 onward. Each row represents one criminal record, composed of the following data elements:
First, the data was read. The three-month period provided was then summarized using a Pareto chart (a bar plot in which the categories are sorted in non-increasing order, with a line added to show the cumulative sum) from the Quality Control Charts package (qcc). The chart below represents the cumulative summation of over 85% of the crimes in the City.
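The numbers behind a Pareto chart can be sketched in base R as follows. This is a minimal illustration, not the code used for the report: the category names and counts are made up, and the 85% cutoff simply mirrors the figure quoted above. With the qcc package loaded, passing the same named count vector to `pareto.chart()` should draw the chart itself.

```r
# Hypothetical incident counts per category (made-up numbers, not SFPD data)
counts <- c(A = 50, B = 30, C = 15, D = 5)

# Order categories from most to least frequent (non-increasing order)
ordered <- sort(counts, decreasing = TRUE)

# Cumulative share of all incidents, in percent
cum_pct <- 100 * cumsum(ordered) / sum(ordered)

pareto <- data.frame(
  category = names(ordered),
  count = as.vector(ordered),
  cum_pct = as.vector(cum_pct),
  stringsAsFactors = FALSE
)
print(pareto)

# Smallest leading set of categories that covers at least 85% of incidents
cutoff <- which(pareto$cum_pct >= 85)[1]
top_categories <- pareto$category[seq_len(cutoff)]
```

The cumulative column is what the line on the chart traces; the bars are the sorted counts.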
Of the 37 categories of crime in this dataset, 30 categories represent less than 18% of total crimes reported. Further, both the ‘Other Offenses’ and ‘Non-Criminal’ categories (over 22% of the total) appeared, in my view, too broadly applied to extract meaningful patterns. For example, ‘Other Offenses’ includes items ranging from MISCELLANEOUS INVESTIGATION to TRAFFIC VIOLATION to OBSCENE PHONE CALLS(S). Also, ‘Non-Criminal’ included curious descriptions such as ‘INVESTIGATIVE DETENTION.’ Without clarification I was unsure what offense was being tracked. Of the 6 categories left, I selected 3 to focus on, representing almost half of the total crimes reported: ‘Larceny/Theft,’ ‘Assault,’ and ‘Vehicle Theft.’
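The selection step can be sketched as below: compute each category's share of all incidents, then keep only the rows in the three focus categories. The `incidents` data frame is a toy stand-in with invented rows; only the `Category` column name and the category labels come from the report.

```r
# Toy stand-in for the incident table (invented rows, not SFPD data)
incidents <- data.frame(
  Category = c("LARCENY/THEFT", "ASSAULT", "OTHER OFFENSES", "VEHICLE THEFT",
               "LARCENY/THEFT", "NON-CRIMINAL", "LARCENY/THEFT", "ASSAULT"),
  stringsAsFactors = FALSE
)

# The three categories the report focuses on
focus <- c("LARCENY/THEFT", "ASSAULT", "VEHICLE THEFT")

# Share of all incidents that falls in each category
share <- prop.table(table(incidents$Category))

# Keep only the rows belonging to the focus categories
selected <- incidents[incidents$Category %in% focus, , drop = FALSE]

# Combined share of the focus categories in the full table
focus_share <- sum(share[focus])
```

Filtering with `%in%` keeps exactly the named categories, which is safer than excluding the unwanted ones one by one.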
The goal is to analyze criminal incident data from San Francisco and visualize patterns of activity in a small set of data for the summer of 2014. To show when and where crime against persons or their property happens in the City, a few visualizations are presented that can suggest a general pattern of activity. The first spatial plots use contour maps to represent the data graphically, with each value in the map marked in color. These maps are used to explore the relationships among location or time and category.
The following visualizations are offered for review.
This plot uses ggmap to create a contour map overview of the locations of the 3 activities. Note that the northeastern corner of the City and the Mission, the most densely populated areas, are also the areas with the most criminal activity (see Appendix A). The central downtown, including the Tenderloin district, appears to be the epicenter of activity: the highest-density contours for all categories peak there.
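###########################################################
# Focus on major crimes of concern
library(ggmap)  # also attaches ggplot2
# Keep only the three categories under study. Filtering with %in% avoids the
# NA values that factor() below would otherwise create for any other category.
SF_crimes <- subset(SFCrime,
                    Category %in% c("VEHICLE THEFT", "LARCENY/THEFT", "ASSAULT"))
# rank SF crimes
SF_crimes$Category <- factor(SF_crimes$Category,
                             levels = c("VEHICLE THEFT", "LARCENY/THEFT", "ASSAULT"))
###########################################################
# Plot 1 with ggmap
###########################################################
# get a color map
# (recent ggmap releases need a Google API key registered with register_google() first)
SFMap <- get_map("San Francisco", zoom = 12)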
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=San+Francisco&zoom=12&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=San%20Francisco&sensor=false
SFMap <- ggmap(SFMap, extent = "normal", legend = "right")
# a filled contour plot...
SFMap +
  stat_density2d(aes(x = X, y = Y, colour = Category,
                     fill = ..level.., alpha = ..level..),
                 size = .75, data = SF_crimes, geom = "polygon") +
  scale_fill_gradient("Crime\nDensity") +
  ggtitle("SF Crime Contour Map") +
  scale_alpha(range = c(.2, 1), guide = FALSE) +
  guides(fill = guide_colorbar(barwidth = 1, barheight = 5))
The vehicle theft contour lines are shown to be widespread at a low level of activity throughout the City, in contrast with unusually high activity through the Mission District and a finger of increased activity reaching through the southern neighborhoods to the Outer Mission. Vehicle theft is generally less dense during July and seems to abandon a good part of the western side of the City. This is typically when the Bay Area surf is down and the City, Ocean Beach at its west end in particular, is foggy.
In the contour maps, both Larceny/Theft and Assault are very concentrated in the Downtown area, with modest fluctuations of Larceny/Theft in the smaller North Beach tourist area and constant Assault activity at the southeastern end of the City at Bayview/Hunters Point.
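To make the density contours less of a black box, here is a small sketch of the kind of two-dimensional kernel density estimate that, as far as I understand, `stat_density2d` computes through `MASS::kde2d`. The points are synthetic (one tight cluster plus scattered background), standing in for incident coordinates; the grid size and limits are arbitrary choices for the illustration.

```r
library(MASS)  # recommended package that ships with standard R installations

set.seed(42)
# Synthetic points: a tight cluster near (0, 0) plus a sparse uniform background
cluster <- cbind(rnorm(300, mean = 0, sd = 0.2), rnorm(300, mean = 0, sd = 0.2))
background <- cbind(runif(100, -3, 3), runif(100, -3, 3))
pts <- rbind(cluster, background)

# Kernel density estimate evaluated on a 50 x 50 grid over the square [-3, 3]^2
dens <- kde2d(pts[, 1], pts[, 2], n = 50, lims = c(-3, 3, -3, 3))

# Grid cell with the highest estimated density (rows index x, columns index y)
peak <- which(dens$z == max(dens$z), arr.ind = TRUE)[1, ]
peak_x <- dens$x[peak[1]]
peak_y <- dens$y[peak[2]]

# contour(dens) would draw the contour lines for this surface
```

The highest contour levels sit where points are most tightly packed, which is how the downtown peak shows up in the maps above.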